35. Final Exam

Part 1

Note: Trivial questions are left out. Only the questions I found difficult, or whose explanation or calculation seemed worth noting down, are tackled here.

Question 4

image.png

In [1]:
%load_ext tikzmagic
In [25]:
from graphviz import Digraph
from helpers import update_samecoin_graph

def draw_samecoin_graph(g, n_flips=2, highlight_nodes=(), color='yellow'):

    g.attr(rankdir='LR', ranksep='0.5')
    g.attr('node', shape='circle', fontsize='10')
    g.attr('edge', fontsize='10')

    g.node('Root','R')
    g.node('H0','H') # Fair coin
    g.node('T0','T') # loaded coin

#   print('noding')
    i_outcome = 1
    for each_flip in range(1,n_flips):
        n_outcomes = 2**each_flip
        for each_outcome in range(0, n_outcomes):            
            new_H = 'H{}'.format(i_outcome) 
            new_T = 'T{}'.format(i_outcome)             
            g.node(new_H, 'H')
            g.node(new_T, 'T')                        
            i_outcome += 1 

    # choose F or L 
    g.edge('Root','H0',label='0.7')
    g.edge('Root','T0',label='0.3')

    # flip 1 of H/T (F or L is not considered a flip)
    g.edge('H0','H1',label='0.5')
    g.edge('H0','T1',label='0.5')
    g.edge('T0','H2',label='0.5')
    g.edge('T0','T2',label='0.5')            

    for each_node in highlight_nodes:
        #print(each_node)
        g.node(each_node,style='filled',fillcolor=color)

    return g


g = Digraph()
g = draw_samecoin_graph(g)  # hardcoded for this problem now
# Root - Choosing between Fair or Loaded Coin
g = update_samecoin_graph(g, highlight_nodes=['Root'],color='#33EA4C:#A2E9FF')

g = update_samecoin_graph(g, highlight_nodes=['H0','T1'],color='#A2E9FF')

g
Out[25]:
[Graph output: Root -> H (0.7), Root -> T (0.3); H -> H (0.5), H -> T (0.5); T -> H (0.5), T -> T (0.5)]

Answer: $P(H_l)P(T_l) = (0.7)(0.5) = 0.35$

Question 5

image.png

Three of the four paths contain at least one head, as highlighted below. Therefore,

In [31]:
g = Digraph()
g = draw_samecoin_graph(g)  # hardcoded for this problem now
# Root - Choosing between Fair or Loaded Coin
g = update_samecoin_graph(g, highlight_nodes=['Root'],color='#33EA4C:#A2E9FF')

g = update_samecoin_graph(g, highlight_nodes=['H0','T1'],color='#A2E9FF')
g = update_samecoin_graph(g, highlight_nodes=['H0','H1'],color='#A2E9FF')
g = update_samecoin_graph(g, highlight_nodes=['T0','H2'],color='#A2E9FF')

g
Out[31]:
[Graph output: Root -> H (0.7), Root -> T (0.3); H -> H (0.5), H -> T (0.5); T -> H (0.5), T -> T (0.5)]

$$ P( \text{H in any flip} ) = \dfrac{(0.7)(0.5) + (0.7)(0.5) + (0.3)(0.5)}{(0.7)(0.5) + (0.7)(0.5) + (0.3)(0.5) + (0.3)(0.5)} = 0.85 $$

In [26]:
0.7*0.5+0.7*0.5+0.3*0.5
Out[26]:
0.85
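
The same numbers can be cross-checked by enumerating every path of the tree. This is only a sketch: the dictionaries `first` and `second` are hypothetical names, and the probabilities are read off the edge labels of the graph above (0.7/0.3 at the root, then 0.5/0.5).

```python
from itertools import product

# Assumed reading of the tree: the edge labels of the graph above.
first = {'H': 0.7, 'T': 0.3}    # edges leaving the root
second = {'H': 0.5, 'T': 0.5}   # edges leaving each first-level node

# Probability of a path is the product of the edge probabilities along it.
paths = {a + b: first[a] * second[b] for a, b in product(first, second)}

# Question 4: the single path H then T.
p_h_then_t = paths['HT']

# Question 5: add up every path containing at least one H ...
p_any_head = sum(p for path, p in paths.items() if 'H' in path)

# ... or, equivalently, take the complement of the only path with no H.
p_complement = 1 - paths['TT']

print(p_h_then_t, p_any_head, p_complement)
```

The complement route needs only one multiplication, which is usually the quicker way to answer "at least one" questions.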

Question 6

image.png

In [50]:
def draw_samecoin_graph(g, n_flips=2, highlight_nodes=(), color='yellow'):

    g.attr(ranksep='0.5')
    g.attr('node', shape='circle', fontsize='10')
    g.attr('edge', fontsize='10')

    g.node('Root','R')
    g.node('H0','H') # Fair coin
    g.node('T0','T') # loaded coin

#   print('noding')
    i_outcome = 1
    for each_flip in range(1,n_flips):
        n_outcomes = 2**each_flip
        for each_outcome in range(0, n_outcomes):            
            new_1 = '1{}'.format(i_outcome) 
            new_2 = '2{}'.format(i_outcome)             
            new_3 = '3{}'.format(i_outcome) 
            new_4 = '4{}'.format(i_outcome) 
            new_5 = '5{}'.format(i_outcome)             
            new_6 = '6{}'.format(i_outcome)             
            g.node(new_1, '1')
            g.node(new_2, '2')                        
            g.node(new_3, '3')                        
            g.node(new_4, '4')
            g.node(new_5, '5')                        
            g.node(new_6, '6')                                    
            i_outcome += 1 

    # choose F or L 
    g.edge('Root','H0',label='0.5')
    g.edge('Root','T0',label='0.5')

    g.edge('H0','11',label='0.166')
    g.edge('H0','21',label='0.166')
    g.edge('H0','31',label='0.166')
    g.edge('H0','41',label='0.166')
    g.edge('H0','51',label='0.166')
    g.edge('H0','61',label='0.166')
    g.edge('T0','12',label='0.125')
    g.edge('T0','22',label='0.125')
    g.edge('T0','32',label='0.125')
    g.edge('T0','42',label='0.125')
    g.edge('T0','52',label='0.125')
    g.edge('T0','62',label='0.125')    
    g.edge('T0','7',label='0.125')
    g.edge('T0','8',label='0.125')    


    for each_node in highlight_nodes:
        #print(each_node)
        g.node(each_node,style='filled',fillcolor=color)

    return g


g = Digraph()
g = draw_samecoin_graph(g)  # hardcoded for this problem now
# Root - Choosing between Fair or Loaded Coin
g = update_samecoin_graph(g, highlight_nodes=['Root'],color='#33EA4C:#A2E9FF')

g = update_samecoin_graph(g, highlight_nodes=['H0','61'],color='#A2E9FF')
g = update_samecoin_graph(g, highlight_nodes=['T0','62'],color='#A2E9FF')

g
Out[50]:
[Graph output: Root -> H (0.5), Root -> T (0.5); H -> 1, 2, 3, 4, 5, 6 (0.166 each); T -> 1, 2, 3, 4, 5, 6, 7, 8 (0.125 each)]

As can be seen above, there are two ways of getting a 6, one under each branch. So, with $1/6$ rounded to 0.166,

$$ P( \text{getting 6} ) = (0.5)(0.166) + (0.5)(0.125) = 0.1455 $$

In [51]:
(0.5)*(0.166) + (0.5)*(0.125)
Out[51]:
0.14550000000000002

Question 7

In the same setup as above, what is $P( \text{heads} | 6)$? As we just saw, there are two ways of getting a 6, with a total probability of about 0.145. Only one of them goes through heads. So

$$ P(\text{heads} | 6) = \dfrac{(0.5)(0.166)}{0.145} \approx 0.572 $$

In [52]:
0.5*0.166/0.145
Out[52]:
0.5724137931034483
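
As a cross-check without rounding, the same Bayes computation can be done with exact fractions. This is a sketch under an assumption: the edge labels 0.166 and 0.125 are taken to stand for $1/6$ and $1/8$ (six and eight equally likely outcomes on the two branches). The variable names are hypothetical.

```python
from fractions import Fraction

prior = Fraction(1, 2)              # probability of each branch at the root
p6_heads_branch = Fraction(1, 6)    # assumed exact value behind the label 0.166
p6_tails_branch = Fraction(1, 8)    # assumed exact value behind the label 0.125

# Total probability of a 6: add the two paths that end in 6.
p_six = prior * p6_heads_branch + prior * p6_tails_branch

# Bayes' rule: share of that total contributed by the heads branch.
p_heads_given_six = prior * p6_heads_branch / p_six

print(p_six, float(p_six))
print(p_heads_given_six, float(p_heads_given_six))
```

The exact values are $7/48 \approx 0.1458$ and $4/7 \approx 0.5714$, so the rounded figures above differ only in the third decimal place.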

Question 8

image.png

Out of $n=7$ days, we want the probability that exactly $r=2$ days have rain. By analogy with coin flips, the number of days plays the role of the number of flips, and each day's outcome is rain or no rain with $p = 0.2$. Thus it is a binomial distribution.

$$ P(\text{rain for 2 days}) = \binom{n}{r}p^r(1-p)^{n-r} = \binom{7}{2}(0.2)^2(1-0.2)^{7-2} = 0.275 $$

In [54]:
21*((0.2)**2)*((0.8)**5)
Out[54]:
0.27525120000000014
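
The hard-coded 21 above is $\binom{7}{2}$. A small helper makes that explicit; `binomial_pmf` is a hypothetical name, and `math.comb` needs Python 3.8 or newer.

```python
from math import comb

def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n_ways = comb(7, 2)                   # number of ways to pick the 2 rainy days
p_two_days = binomial_pmf(2, 7, 0.2)

print(n_ways, p_two_days)
```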

Question 9

image.png

The answer is the sum of the individual probabilities for each $r$ from 2 to 7:

$$ P(X \geq 2) = P(X=2) + P(X=3) + P(X=4) + P(X=5) + P(X=6) + P(X=7) = 0.423 $$

In [68]:
from scipy.stats import binom

n, p = 7, 0.2

# P(X >= 2): add the pmf for r = 2, ..., 7
tp = 0
for i in range(2, 8):
    tp += binom.pmf(i, n, p)
tp
Out[68]:
0.4232832000000003
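
A shorter route to the same number is the complement rule, $P(X \geq 2) = 1 - P(X=0) - P(X=1)$, which needs two terms instead of six. A minimal sketch with the same $n$ and $p$, using only the standard library:

```python
from math import comb

n, p = 7, 0.2

p_zero = comb(n, 0) * (1 - p)**n             # no rainy day at all
p_one = comb(n, 1) * p * (1 - p)**(n - 1)    # exactly one rainy day

p_at_least_two = 1 - p_zero - p_one
print(p_at_least_two)
```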

Part 2

Question 10

What is the z score?

$$\begin{aligned} \mu &= 100, \ \ \sigma = 15, \ \ X = 130 \\ Z &= \dfrac{X - \mu}{\sigma} = \dfrac{130 - 100}{15} = 2 \end{aligned}$$

Question 11

What is the distribution of distance from initial to final position?

image.png

$$\begin{aligned} E(X - Y) &= E(X) - E(Y) = 10 - 5 = 5 \\ Var(X-Y) &= Var(X) + Var(Y) = \sigma_X^2 + \sigma_Y^2 = 1^2 + (0.5)^2 = 1.25 \quad \text{(for independent } X, Y \text{)} \\ \sigma_{X-Y} &= \sqrt{Var(X-Y)} = 1.12 \end{aligned}$$

In [4]:
from math import sqrt
s_r = sqrt(1**2 + 0.5**2)
s_r
Out[4]:
1.118033988749895
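
The standard library can do this bookkeeping directly, which gives a quick check. The sketch below assumes that $X$ and $Y$ are independent and normally distributed (the question image is not reproduced here, so normality is an assumption); `statistics.NormalDist` needs Python 3.8 or newer.

```python
from statistics import NormalDist

X = NormalDist(mu=10, sigma=1)     # assumed normal, values from the working above
Y = NormalDist(mu=5, sigma=0.5)    # assumed normal and independent of X

# Subtracting two independent NormalDist objects subtracts the means
# and adds the variances.
D = X - Y

print(D.mean, D.variance, D.stdev)
```

Note that the variances add even though the variables are subtracted; only the means are subtracted.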

Question 12

image.png

Ans:

$$\begin{aligned} E(aX) &= aE(X) = 2.54(70) = 177.8 \\ Var(aX) &= a^2Var(X) = (2.54)^2(25) = 161.29 \end{aligned}$$

In [5]:
m = 2.54*70
v = ((2.54)**2)*25
m,v
Out[5]:
(177.8, 161.29)

Question 13

Note carefully: the question asks for a confidence interval (CI) for the probability.

image.png

From the 10000 Bernoulli trials, we can estimate the mean and standard deviation of a single trial as below.

$$\begin{aligned} \hat{p} &= \dfrac{4950}{10000} = 0.4950 \\ \overline{x} &= \hat{p} = 0.4950 \\ s &= \sqrt{\hat{p}(1-\hat{p})} = \sqrt{(0.495)(1-0.495)} \approx 0.4999 \end{aligned}$$

In [47]:
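from math import sqrt

p = 4950/10000          # sample proportion of heads
m = p                   # mean of a single Bernoulli trial
s = sqrt(p*(1 - p))     # standard deviation of a single Bernoulli trial
s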
Out[47]:
0.49997499937496875

Calculating Critical Value $z_{\frac{\alpha}{2}}$

If the confidence level is 90%, then the significance level $\alpha$ is 10%, so $\alpha/2 = 0.05$ lies in each tail and the corresponding critical value is $z_{\frac{\alpha}{2}} = 1.645$.

In [8]:
from scipy import stats

def get_z(cl):
    # NOTE: returns the right-tail critical value, as that is mostly used in CIs
    tail = round((1 - cl)/2, 3)   # alpha/2, the area in one tail
    # ppf gives the left-tail (negative) value, so flip its sign
    return (-1)*round(stats.norm.ppf(tail), 3)

cl = 0.90
print(get_z(cl))
1.645

Calculating CI

Since $\hat{p}$ is the mean of $n=10000$ trials, its sampling distribution is approximately normal with standard error $s/\sqrt{n}$, as plotted further below. Calculating the CI for it,

$$\begin{aligned} CI &= \overline{x} \pm z_{\frac{\alpha}{2}}\dfrac{s}{\sqrt{n}} \\ &= 0.4950 \pm 1.645\dfrac{0.4999}{\sqrt{10000}} \\ &= 0.4950 \pm 1.645(0.004999) \\ &= (0.4868, 0.5032) \end{aligned}$$

In [50]:
0.4950 - 1.645*0.004999, 0.4950 + 1.645*0.004999
Out[50]:
(0.486776645, 0.503223355)
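
The whole calculation can be wrapped into one function. This is a sketch of the normal-approximation interval used above, not a general-purpose implementation; `proportion_ci` is a hypothetical name, and `statistics.NormalDist` (Python 3.8+) stands in for the z-table lookup.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Normal-approximation CI for a proportion (hypothetical helper)."""
    p_hat = successes / n
    # two-sided critical value z_{alpha/2}
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # standard error of the sample proportion, times z
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lo, hi = proportion_ci(4950, 10000, 0.90)
print(round(lo, 4), round(hi, 4))
```

Since 0.5 lies inside the interval, the data are consistent with a fair coin at the 90% confidence level.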
In [46]:
%matplotlib inline
import matplotlib.pyplot as plt
# from normalviz import draw_normal
import numpy as np
from scipy.stats import norm  # replaces mlab.normpdf, which recent matplotlib releases no longer provide

def draw_normal(ax, mu, sigma, cond=''):
    """
    cond: to shade the area meeting the condition
    """
    xstart = mu - 4*sigma
    xend = mu + 4*sigma
    x = np.linspace(xstart, xend, 100)
    y = norm.pdf(x, mu, sigma)
    ax.plot(x, y, color='black')

    # shade area satisfying the condition (cond is an expression in x, e.g. 'x<0.4867')
    w = x[eval(cond)] if cond != '' else x
    w_shade = norm.pdf(w, mu, sigma)
    ax.fill_between(w, 0, w_shade)

    # set x axis in multiples of sigma
    x_ticks = []
    for step in range(-4, 5): # 4 sigma on right, 4 on left, mu on middle
        # 4 decimals, so that ticks stay distinct when sigma is small
        x_tick = round(mu + step*sigma, 4)
        x_ticks.append(x_tick)
    ax.xaxis.set_ticks(x_ticks)
    ax.grid(True, linestyle='--', alpha=0.5)

    ax.set_ylim(bottom=0)  # the ymin= keyword is not accepted by recent matplotlib

mu = 0.4950
sigma = 0.004999


# plot: shade the two tails outside the 90% CI
fig, ax = plt.subplots(1, 1, figsize=(7, 4))
draw_normal(ax, mu, sigma, 'x<0.4867')
draw_normal(ax, mu, sigma, 'x>0.5032')
ax.set_xlabel('Sample proportion of heads')
ax.set_ylabel('Probability density')
plt.show()

Question 14

image.png

In [52]:
from math import sqrt

x = [0.79,0.70,0.73,0.66,0.65,0.70,0.74,0.81,0.71,0.70]

n = len(x)
xb = sum(x)/n                              # sample mean
v = sum([ (i - xb)**2 for i in x ] )/n     # variance, dividing by n (not n-1)
s = sqrt(v)
xb, s
Out[52]:
(0.719, 0.048259714048054625)
In [53]:
# margin of error for a 95% CI: 1.96 standard errors
se = 1.96*(s/sqrt(n))
xb - se, xb + se
Out[53]:
(0.6890883193384256, 0.7489116806615743)

Question 15

Calculate slope and y-intercept for given data.

image.png

In [54]:
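x = [0,1,2]
y = [0,2,2]

# means
n = len(x)  # could also use len(y), as the data are (x, y) pairs
x_b, y_b = sum(x)/n, sum(y)/n

# slope: sum of cross-deviations over sum of squared x-deviations
b_1 = sum([(xi - x_b)*(yi - y_b) for xi, yi in zip(x,y)])/ sum([(xi - x_b)**2 for xi in x])
# y-intercept: the fitted line passes through the point of means (x_b, y_b)
b_0 = y_b - b_1*x_b

b_0, b_1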
Out[54]:
(0.33333333333333326, 1.0)
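
One way to sanity-check a least-squares fit is to look at its residuals: for the best-fit line they sum to zero and are uncorrelated with $x$. A minimal sketch using the slope and intercept found above (the names `fitted` and `residuals` are hypothetical):

```python
x = [0, 1, 2]
y = [0, 2, 2]
b_0, b_1 = 1/3, 1.0     # intercept and slope from the result above

fitted = [b_0 + b_1*xi for xi in x]                  # points on the fitted line
residuals = [yi - fi for yi, fi in zip(y, fitted)]   # vertical distances to the data

# Both sums vanish (up to floating-point error) for a least-squares line.
sum_residuals = sum(residuals)
sum_x_residuals = sum(xi*ri for xi, ri in zip(x, residuals))

print(residuals, sum_residuals, sum_x_residuals)
```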

Question 16

Rank the correlation coefficients $r$ from 1 to 4.

image